Genetic diversity, asexual reproduction and conservation of the edible fruit tree Spondias purpurea L. (Anacardiaceae) in the Costa Rican tropical dry forest

The term circa situm has been used to describe different conservation strategies within agricultural landscapes. Circa situm conserves planted or remnant species in farmlands, where natural vegetation has been modified through anthropogenic intervention. It has been proposed that trees planted or retained under circa situm conditions may contribute to maintaining genetic diversity; however, information on the role of this strategy in preserving genetic diversity is scarce. The aim of this study was to determine the levels of genetic diversity and structure, and the mating patterns, in planted and unmanaged stands of the tropical fruit tree Spondias purpurea L. in north western Costa Rica. In three localities, we used seven polymorphic microsatellite loci and genotyped 201 adults and 648 seeds from planted and wild stands. We found no differences in genetic diversity between planted and wild stands. Genetic structure analysis revealed that gene flow occurs between planted and wild stands within localities. Clones were present, and their diversity and evenness were both high and similar between planted and wild stands. The number of pollen donors per progeny array was low (Nep = 1.01), which resulted in high levels of correlated paternity (rp = 0.9). Asexual seeds were found in 4.6% of the progeny arrays, which had multilocus genotypes that were identical to the maternal trees. Our results show that although planted stands under circa situm conditions can maintain levels of genetic diversity similar to those of wild stands, the low number of sires and asexual seed formation could threaten the long-term persistence of populations.


Introduction
Tropical dry forests (TDFs) constitute one of the most important reservoirs of biodiversity in the tropics, with high levels of species richness and endemism [1][2][3][4][5]. TDFs are also socially and economically important because a considerable number of species found in TDFs are used by humans for food, firewood, fodder, medicine, construction, live fencing, ornaments and crafts, rituals, and leather tanning [6][7][8]. This utilitarian value of TDFs as sources of timber and non-timber products has been widely recognized as an argument for their conservation and sustainable management [9][10][11]. However, despite their high biodiversity, TDFs are one of the most threatened ecosystems in the tropics [12,13], with less than 10% of their original extent remaining [13,14]. Human preference for the seasonally dry tropical environment [15] and the ease of clearing its vegetation and suppressing future regeneration with fire [3] have led to the destruction and fragmentation of this habitat [16].
In north western Costa Rica, TDFs have been reduced to a series of small forest patches surrounded by large cattle fields or cultivated areas (e.g., sugar cane and rice fields). These forest patches shelter only 0.1% of the original TDF cover in Costa Rica [3,17]. Several studies have demonstrated that habitat loss and fragmentation negatively affect the reproductive success and genetic diversity of tropical trees, compromising the long-term survivorship of populations [18][19][20][21][22]. However, isolated trees that are maintained by farmers can serve as "stepping stones" between forest patches and, as a result, play a critical role in gene movement and connectivity among tropical forest fragments [23], significantly contributing to the propagule pool of remnant forests [24]. In addition, trees growing among farmlands and in neighboring natural forest fragments may also play an important role in maintaining populations of insects, birds and mammals needed for crop pollination and biological pest control, and in increasing crop productivity [25][26][27]. In agricultural landscapes, this arrangement of trees growing within farmlands and in remnant forest fragments allows farmers to maintain crop production while indirectly conserving species richness and genetic diversity in what can be interpreted as circa situm conservation [28].
The term circa situm has been used to describe different conservation strategies within agricultural landscapes (e.g., agroforestry systems, home gardens, living fences, urban amenities) outside natural habitats, but within the native range of the species [16]. Generally, circa situm conserves planted or remnant trees in farmlands or forest patches where natural forests or woodlands containing the same trees were once found, but where natural vegetation has been lost or modified significantly through anthropogenic intervention [28]. Circa situm plantations are not normally created for conservation [16]; rather, tree stands are planted to provide resources such as food, fibers, medicine and live fences, among others [28], or more recently to provide amenities and comfort in urban parks and streets [29]. Indirectly, they allow for the conservation of different organisms. However, circa situm plantations need to be sufficiently close to wild populations, so that pollen flow (i.e., gene flow) from wild individuals into planted stands is still possible, securing fruit production and reducing the loss of genetic diversity [28][30][31][32][33].
Spondias purpurea L. is a dioecious fruit tree domesticated in the Mesoamerican region [34,35] known locally as "jocote", "ciruelo" or "abal". Jocotes were domesticated for their plumlike fruits, which are collected from rustic plantings and sold fresh in local markets or made into jams and beverages [34,36,37]. Spondias purpurea is pollinated by social bees and wasps, and its fruits are dispersed by small mammals and birds [22]. During fructification, wild S. purpurea trees are easily distinguished by their fruits, which are usually smaller, more acidic and less fleshy than cultivated fruits [35,38]. In Mesoamerica, S. purpurea, like other perennial domesticated species, is generally vegetatively propagated by stem cuttings in homegardens and living fences, frequently in close proximity to wild stands (unmanaged trees) that occur in secondary forests and TDF remnants [35]. Natural populations of this fruit tree face the negative reproductive and genetic consequences of TDF loss and fragmentation [22]; however, information on the role of cultivated trees in the maintenance of its genetic diversity is lacking. In other species, research has shown that cultivated individuals can help maintain genetic diversity in managed landscapes [30][31][32]. Female S. purpurea trees are more likely to be vegetatively propagated in circa situm and along live fences for fruit consumption. Fruit production can be achieved via sexual reproduction, if male trees from nearby populations act as pollen sources, or via asexual reproduction (i.e., apomixis) [37]. Planted stands, predominantly composed of female trees, receive less pollen from fewer donors, because male S. purpurea flowers offer pollen and nectar, while female flowers only offer nectar as a reward; thus females are less likely to attract pollinators [22]. 
This may result in a higher proportion of asexual fruit production in planted stands [37], and fruits produced via sexual reproduction, are expected to have a lower number of sires compared to fruits from wild stands.
Our study took place in the north western Pacific region of Costa Rica, where wild stands of S. purpurea in secondary forests and isolated trees are in close proximity to plantings along live fences, on small farms or in homegardens. The objectives of our study were (1) to compare genetic diversity and structure among wild and planted stands of S. purpurea, (2) to estimate mating patterns in terms of correlated paternity of wild and planted stands of S. purpurea, and (3) to determine the frequency of sexual and asexual seeds produced in wild and planted stands of S. purpurea. This will allow us to analyze the role of planted stands for circa situm conservation of genetic diversity in this tropical dioecious species.

Study area and sampling
The study was conducted in north western Costa Rica in the province of Guanacaste. The study region consists primarily of TDF with a mean annual rainfall of 1600 mm and a marked dry season that extends from December to May [21]. The majority of the TDF in the study area was destroyed by the timber and cattle industry in the second half of the twentieth century, and has been converted to pastures and agricultural fields [17]. Presently, most S. purpurea individuals may be found in rustic plantations, propagated vegetatively for fruit consumption or found in natural habitats. Planted and wild stands of S. purpurea were located in three localities: Agua Caliente (AC), Murciélago (MU) and Horizontes Forestal Research Station (HO) (Fig 1). The three study sites are all TDFs that differ in their land use. Agua Caliente (AC) and Murciélago (MU) are disturbed habitats mainly composed of remnant forest patches and isolated trees surrounded by an agricultural matrix (Fig 1). Along both sites planted trees are found in backyard gardens and living fences, while wild trees grow within small forest remnants and in pasture lands as isolated trees (Fig 1). HO is a 7317 ha managed secondary forest. In the past, large portions of HO were used for rice, cotton, and sorghum production, as well as cattle grazing. However, over three decades ago the site was converted to an experimental station and was allowed to regenerate [39,40]. At HO planted stands are trees vegetatively propagated in backyard gardens near station facilities, wild trees are growing in a forest remnant along a brook and in secondary forests that are part of the natural regeneration program of the station (Fig 1). Within each study site, we located, marked and mapped all adult S. purpurea trees. Sex ratio per site was expressed as the proportion of males in the population: males / (females + males) [41]. 
To determine deviations from a 1:1 sex ratio, we performed a goodness-of-fit G-test for all sites using the stats library implemented in R [42]. We collected fresh leaf tissue from adults (male and female trees) and stored it at -20˚C until DNA extraction. To estimate mating patterns and correlated paternity of the progeny, we collected 15 fruits per female tree, directly from the canopy of 20 female trees in both planted and wild stands. We dissected fruits and sampled diploid embryos for DNA extraction. The permits to access and collect within the protected areas of the study were granted by the Research Program of the Área de Conservación Guanacaste (ACG) to the researchers of this study (Permit number: ACG-023-2018).
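The sex-ratio calculation and the goodness-of-fit G-test described above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the R workflow used in the study; the function names and the tree counts are hypothetical, and the p-value assumes the usual chi-square approximation with one degree of freedom.

```python
import math


def sex_ratio(males: int, females: int) -> float:
    """Proportion of males in the population: males / (females + males)."""
    return males / (females + males)


def g_test_even_sex_ratio(males: int, females: int) -> tuple[float, float]:
    """G goodness-of-fit test of observed counts against a 1:1 sex ratio.

    Returns (G, p), where p comes from the chi-square approximation with
    one degree of freedom (assumption of this sketch).
    """
    expected = (males + females) / 2
    g = 0.0
    for observed in (males, females):
        if observed > 0:  # a zero count contributes nothing to G
            g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(G / 2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Hypothetical counts for one site (not data from the study).
males, females = 10, 40
g_stat, p_value = g_test_even_sex_ratio(males, females)
print(f"sex ratio = {sex_ratio(males, females):.2f}, G = {g_stat:.2f}, p = {p_value:.2g}")
```

A site with equal numbers of males and females gives G = 0 and p = 1, while strongly female-biased counts such as the hypothetical ones above give a large G and a small p.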

DNA extraction and microsatellite amplification
DNA from leaves and embryos was extracted using a modified cetyltrimethylammonium bromide (CTAB) protocol [43]. Ten microsatellites previously developed for S. purpurea [44] were amplified via multiplex PCR using the QIAGEN Multiplex kit (QIAGEN, Hilden, Germany) in 12 μL reaction volumes. Multiplex PCR amplification conditions followed Cristóbal-Pérez et al. [38], and fragments were analyzed on an automatic ABI-PRISM 3100-Avant sequencer (APPLIED BIOSYSTEMS, Carlsbad, California, USA), using GeneScan LIZ 600 (APPLIED BIOSYSTEMS) to determine fragment sizes. Alleles were scored manually using GeneMarker Software version 2.6.4 (SOFTGENETICS). To reduce genotyping error, all samples were independently scored by two different researchers to reach a consensus in the final data set. In addition, the software MICROCHECKER [45] was used to detect the presence of null alleles and large-allele dropout across all loci. Three loci (SPUR41, SPUR35 and SPUR39) that were monomorphic in all populations were excluded from further analyses.

Genetic data analysis
A total of 201 adult individuals (46 planted and 155 wild) and 600 seeds (281 planted and 319 wild) were sampled and genotyped in the three study sites. Genetic diversity was quantified for planted and wild stands using the following parameters: allele number averaged across loci (Na), allelic richness (Ar), observed (Ho) and expected heterozygosities (He), and inbreeding coefficients (F). Allelic richness was estimated by rarefaction of alleles as implemented in the hierfstat library [46]. All other diversity parameters were calculated using the library poppr [47] implemented in R [42]. All parameters were calculated using the complete data set and a data set excluding repeated matching multilocus genotypes (i.e., clones). Differences in genetic diversity parameters (He, Ho and F) between planted and wild stands were tested for significance using 10 000 permutations in GenoDive v.3.04 [48]. A permutation test was also used to test if inbreeding coefficients differed significantly from zero. We used GenAlEx v.6.5 [49,50] to estimate the probability of identity (PI), which is the probability of randomly sampling two identical genotypes. To identify individuals that share the same multilocus genotype (genets), we used the software GenoDive v.3.04 [48]. Individuals with the same multilocus genotype were assigned to the same genet. Genotypes with missing data were ignored [48]. After we calculated the number of unique multilocus genotypes, Simpson's genotypic diversity and genotypic evenness indexes were estimated. The Simpson genotypic index [48] was calculated as $D = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$-th multilocus genotype. This index estimates the probability that two randomly selected genotypes are different and scales from 0 (no genotypes are different) to 1 (all genotypes are different). Genotypic evenness is a measure of the distribution of genotype abundances, which takes values from 0 to 1 [48]. 
A genotypic evenness of 1 indicates that all genotypes in a population are equally abundant, while an evenness closer to zero is expected for a population dominated by a single genotype [48]. Genotypic richness was estimated as R = (G − 1)/(N − 1), where G is the number of distinct genotypes identified and N is the number of sampled units (trees) analyzed; R is always zero for a single-clone stand and one for maximal genotypic diversity, when every sampled unit is a new genet [51].
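The clonal indices described above can be illustrated with a short Python sketch. It is not the GenoDive implementation: the function name and the genet labels are hypothetical, evenness is assumed here to be the effective number of genotypes (1 / Σ p_i²) divided by the number of genotypes, and both the plain Simpson index and a sample-size-corrected version are returned because the text does not state which form was reported.

```python
from collections import Counter


def clonal_summary(genet_labels: list[str]) -> dict[str, float]:
    """Summarize clonal structure from one genet label per sampled tree."""
    n = len(genet_labels)
    if n < 2:
        raise ValueError("need at least two sampled trees")
    counts = Counter(genet_labels)
    g = len(counts)  # number of distinct multilocus genotypes (G)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    simpson_d = 1.0 - sum_p2  # probability that two draws (with replacement) differ
    return {
        "n_genets": g,
        "simpson_d": simpson_d,
        # Sample-size-corrected version, which reaches 1 when all genotypes differ.
        "simpson_d_corrected": simpson_d * n / (n - 1),
        # Assumed evenness: effective number of genotypes / number of genotypes.
        "evenness": (1.0 / sum_p2) / g,
        # Genotypic richness R = (G - 1) / (N - 1).
        "richness_r": (g - 1) / (n - 1),
    }


# Hypothetical stand of six trees, three of which share genet "A".
print(clonal_summary(["A", "A", "A", "B", "C", "D"]))
```

In this toy stand, four genets among six trees give R = 0.6, and the dominance of genet "A" pulls evenness below 1.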
To explore the overall genetic structure of planted and wild stands, we used the software STRUCTURE [52,53] to determine the best configuration of samples into K clusters based on similarity in allele frequencies, and possible admixture among clusters. We used the admixture model with correlated allele frequencies, with 250 000 MCMC iterations and a burn-in of 25 000 iterations. We estimated the likelihood of each configuration for K between 2 and 8, using 15 replicates for each K value. StructureHarvester v0.6.94 [54] was used to determine the most likely number of K clusters, and the Q-matrices were merged using the Full-Search Algorithm implemented in CLUMPP 1.1 [55]. CLUMPP's output was visualized using the popHelper library [56]. We estimated differences in allele frequencies among planted and wild stands using Nei's GST statistic [57], with 1000 permutations to test for significance using GenoDive v.3.04. We estimated differences using all the genotyped individuals, and also using a reduced dataset with a single individual per clone. Relatedness of individual trees was estimated for the planted and wild stands using Loiselle's kinship coefficient (FIJ) [58], estimated in GenoDive v.3.04. This coefficient is based on the relative probability of identity by descent of the alleles within the two compared individuals [58].
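As an illustration of the differentiation statistic mentioned above, the following Python sketch computes Nei's GST = (HT − HS) / HT from per-population allele frequencies. It is a simplified textbook form under stated assumptions: populations are weighted equally, no sample-size correction is applied, and the allele frequencies are hypothetical. GenoDive's estimator and its permutation test may differ in these details.

```python
def expected_heterozygosity(freqs: dict[str, float]) -> float:
    """Expected heterozygosity 1 - sum(p^2) for one locus in one population."""
    return 1.0 - sum(p * p for p in freqs.values())


def nei_gst(loci: list[list[dict[str, float]]]) -> float:
    """Nei's GST averaged over loci.

    `loci` holds one entry per locus; each entry is a list with one
    allele -> frequency dict per population (equal population weights).
    """
    hs_values, ht_values = [], []
    for populations in loci:
        k = len(populations)
        # HS: mean within-population expected heterozygosity.
        hs_values.append(sum(expected_heterozygosity(f) for f in populations) / k)
        # HT: expected heterozygosity of the pooled (mean) allele frequencies.
        alleles = {a for f in populations for a in f}
        mean_freqs = {a: sum(f.get(a, 0.0) for f in populations) / k for a in alleles}
        ht_values.append(expected_heterozygosity(mean_freqs))
    hs = sum(hs_values) / len(hs_values)
    ht = sum(ht_values) / len(ht_values)
    return (ht - hs) / ht if ht > 0 else 0.0


# Hypothetical allele frequencies at one locus in two stands.
locus = [{"a1": 0.5, "a2": 0.5}, {"a1": 0.9, "a2": 0.1}]
print(round(nei_gst([locus]), 3))
```

Identical allele frequencies in all stands give GST = 0, and stands fixed for different alleles give GST = 1.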
Multilocus correlation of paternity (rpm) was estimated using MLTR [59]. Correlated paternity is a measure of the proportion of pairs of outcrossed siblings that are full siblings. The standard error of the estimates was calculated by bootstrapping with 1 000 repetitions. We estimated the average effective number of pollen donors per maternal plant (Nep) as the reciprocal of rpm [59].
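The relationship between correlated paternity and the effective number of pollen donors can be written as a one-line calculation. The Python sketch below is illustrative only: the function names are hypothetical, and the bootstrap helper assumes that a list of bootstrap replicates of the paternity correlation is already available (for example, exported from MLTR), which is not something the text describes.

```python
import statistics


def effective_pollen_donors(r_p: float) -> float:
    """Effective number of pollen donors as the reciprocal of correlated paternity."""
    if not 0.0 < r_p <= 1.0:
        raise ValueError("correlated paternity must be in (0, 1]")
    return 1.0 / r_p


def bootstrap_se_of_nep(r_p_replicates: list[float]) -> float:
    """Standard error of Nep from bootstrap replicates of r_p (hypothetical input)."""
    return statistics.stdev(effective_pollen_donors(r) for r in r_p_replicates)


# Paternity correlations quoted in this paper.
for label, r_p in [("planted", 0.99), ("wild", 0.91), ("Mexico, fragmented [22]", 0.63)]:
    print(f"{label}: Nep = {effective_pollen_donors(r_p):.3f}")
```

Because Nep is computed from rounded correlations here, the second decimal can differ slightly from values derived from unrounded estimates.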

Results
We did not find statistical differences between planted and wild individuals for observed and expected heterozygosities, nor for inbreeding coefficients (Table 1, S1 and S3 Tables in S1 Data). MICROCHECKER did not find any evidence of null alleles or allelic dropout at any of the analyzed loci. The probability of identity (PI) estimated using all seven loci was PI = 0.00003. The proportion of distinguishable genets based on multilocus genotypes in planted trees (R = 0.74) was similar to that in wild trees (R = 0.77) (Table 2). Clonal diversity was high and similar between planted (D = 0.90) and wild trees (D = 0.96). Clonal evenness measures were also similar between planted (E = 0.70) and wild trees (E = 0.62) (Table 2). Genetic diversity estimates did not deviate significantly from our initial estimates when clones were removed (S2 Table in S1 Data). Sex ratios were female biased (G = 12.71, p < 0.001) for AC and MU, but did not deviate significantly from a 1:1 ratio in HO (p > 0.05) (S4 Table in S1 Data).
StructureHarvester suggested that K = 3 clusters was the most likely configuration for all adult individuals. When individuals were grouped into K = 3 clusters, the observed division did not fully correlate with the division into wild and planted stands. Individuals in three planted (PAC, PMU, PHO) and two wild stands (WAC, WMU) are predominantly grouped into cluster 1, with evidence of admixed individuals from cluster 2 (Fig 2C). WHO individuals are grouped by STRUCTURE into two separate clusters (Fig 2C). In addition, pairwise GST estimates show a lack of differences in allele frequencies between planted and wild stands within sites, with the exception of HO (S5 Table in S1 Data). All wild populations differed from one another in allele frequencies (S5 Table in S1 Data), while planted stands were more similar to one another. Genetic structure patterns were similar when clones were removed from the analysis (S6 Table in S1 Data). Paternity correlations were similar for planted and wild progenies (rp PLANTED = 0.99; rp WILD = 0.91) (Table 3). Therefore, the effective number of pollen donors was approximately one for both planted and wild trees (Nep = 1.01 and 1.09, respectively). We identified 30 seeds with multilocus genotypes that were identical to the maternal tree (MGM), which suggests asexual seed production (Table 4). Our kinship analysis (S7 Table in S1 Data) shows that in almost all cases, trees within stands have elevated kinship values (diagonal of S7 Table in S1 Data). Lower kinship estimates are observed when cultivated or planted trees are compared among sites. Planted and wild trees within sites also show elevated kinship values, with the exception of WHO (wild trees at HO) and PHO (planted trees at HO), which have a pairwise kinship of FIJ = 0.012 (S7 Table in S1 Data).

PLOS ONE
The role of circa situm strategy in the conservation of genetic diversity of Spondias purpurea

Discussion
Genetic diversity estimates in this study were comparable to those found in other tropical trees [60][61][62] and species of the genus Spondias [63][64][65]. The comparison of genetic diversity parameters between S. purpurea planted and wild stands showed no differences, suggesting that circa situm conditions may have conserved genetic diversity in adults. Our genetic diversity estimates were similar to those found in natural and fragmented Mexican populations of S. purpurea, and comparable to genetic diversity values found in other perennial edible plants (e.g., Carya illinoinensis (Wangenh.) K. Koch., Malus domestica (Suckow) Borkh., Olea europaea L., Pistacia vera L., Prunus avium L., Castanea dentata (Marsh.) Borkh., Vitis vinifera L., Leucaena esculenta (DC.) Benth., Polaskia chichipe (Rol-Goss.) A.C.Gibson & K.E.Horak, Euterpe edulis Mart.) [33,37]. In all these previous cases, planted stands had levels of genetic diversity comparable to those of their wild relatives [37], which is contrary to the expectation that domesticated species should have lower genetic diversity than their wild counterparts [66].
Our results show the presence of trees with identical multilocus genotypes (i.e., clonal individuals) both in planted and wild stands. In addition, a small proportion of progenies have multilocus genotypes that are identical to their maternal trees, which is indirect evidence of apomictic seed formation [67,68]. Clonal diversity was similarly high in both planted and wild tree stands (Table 2). High clonal diversity in planted stands can be explained as a result of clonal propagation, which is commonplace in vegetatively propagated plants (e.g. Ficus carica L., Dioscorea rotundata Poir., Olea europaea L., Theobroma cacao L.) [69][70][71][72]. At our study sites, clonal genotypes in wild stands are scattered trees in secondary forests, probably established as a result of dispersion of asexually produced seeds. Some cultivated perennial plants have evolved from producing fruit through sexual reproduction in the wild, to parthenocarpic fruit production under different management intensities (e.g., Musa paradisiaca L., Ficus carica L., Pyrus communis L., Pistacia vera L.) [37]. This reproductive mechanism can be especially important in fragmented, human-disturbed habitats, where lower mating probabilities and changes in pollinator assemblages are common, increasing the uncertainty of animal pollination [22,73]. However, apomictic seeds may reduce the genetic diversity of future generations, particularly for species with reduced effective population sizes [74,75]. Thus, seed production in circa situm stands may not be a viable option as propagule sources for regeneration or the creation of future orchards. The genetic admixture observed in individuals of wild and planted stands at AC and MU suggests that planted trees in these sites were likely selected from seedlings or cuttings of trees in nearby wild populations. In contrast, at HO the genetic differences observed between wild (WHO) and planted (PHO) stands suggest that planted trees (PHO) may have been propagated from other locations with allele frequencies similar to those observed at AC and MU. 
Planted trees are commonly selected for their larger fruits, and our results suggest that these trees may have been selected from a few sources or that farmers have propagated similar genotypes across all planted stands. During the process of domestication, trees that bear large, fleshy, sweet fruits have been selected, and also those that can be reproduced easily from cuttings [35]. Our results indicate that planted individuals of S. purpurea may originate from a few genotypes that have become common in planted and wild stands in Guanacaste. Both our STRUCTURE and GST results suggest that gene flow is likely to occur between wild and planted stands within sites. Furthermore, individual STRUCTURE assignments present indirect evidence of gene flow among sites; however, this occurs less frequently, as evidenced by significant pairwise GST estimates among populations. In the agro-ecosystems we studied, the source of the planted populations may have been nearby wild populations, which would explain the comparable levels of genetic diversity and the low levels of genetic differentiation observed between wild and cultivated plantings. In our study, planted stands are small orchards and live fences in close spatial proximity to wild populations, which inhabit nearby secondary forests or forest remnants. The fact that planted trees likely originated from nearby wild populations, together with the unconstrained gene flow among planted and wild trees, is congruent with the observed levels of genetic structure among wild and planted populations. There is evidence that gene flow can be sustained between planted plants and their wild relatives by pollinators inhabiting forest remnants [72,76]. An interesting result is the genetic structure observed among the wild trees at HO (WHO, Fig 2B and 2C). The purple cluster corresponds to individuals found at a forest remnant along a brook in a mature forest area that has never undergone management. 
In contrast, individuals assigned by STRUCTURE to the teal cluster were collected in a forest patch that has regenerated recently (~30-40 yrs) [39,40]. The second site represents an area under passive ecological restoration in the ACG, where fires and non-native pastures are controlled to allow for natural tree regeneration [77]. These differences in the history of both sites at HO may represent different colonization or founder effects, resulting in differences in allele frequencies that were picked up by the STRUCTURE algorithm. This underlying structure at HO also influences the pairwise GST values, which show that HO differs significantly from all other populations, supporting our previous suggestion that trees at AC and MU probably have different origins compared to HO. Therefore, the historical origin of a population may have a greater influence on genetic diversity, and may better explain the structure, than the simple comparison between planted and wild stands.
Previous studies have shown that S. purpurea depends on pollinators for sexual reproduction and seed production [22,38]. In this study, paternity correlations indicated that seeds from both planted and wild trees were sired by a low number of effective pollen donors, and some seeds were produced asexually. We also observed elevated pairwise kinship values within sites, which can increase the relatedness of progeny arrays. In dioecious plants, a reduced number of sires can be related to lower mate availability (e.g. skewed sex ratios) and changes in pollinator behavior [78][79][80][81][82]. High correlated paternity estimates at AC and MU can be explained by a lower mate availability caused by female-biased sex ratios (male/female < 0.26). The scarcity of males may lead to the overrepresentation of a few pollen donors in the progeny [24], increasing the relatedness of seeds. The high correlated paternity observed in the progeny of S. purpurea could also be related to pollinator behavior. In S. purpurea, a previous study showed that in disturbed habitats, floral displays are larger but pollinator visitation is negatively affected, reducing fruit-set and increasing paternity correlation [22]. Larger floral displays reduced pollinator movement between reproductive individuals, reducing the diversity of sires and increasing the relatedness of maternal progeny arrays [21,22][82][83][84]. Cristóbal-Pérez et al. [22] showed that in fragmented habitats in Mexico, paternity correlation was high (rp = 0.63) as a result of a lower number of sires (Nep = 1.58). Our results show a limited number of sires both in planted and wild stands of S. purpurea in Costa Rica. Pollinator behavior may explain the high correlated paternity observed in the progenies at HO, where sex ratios did not statistically differ from a 1:1 ratio. These results are congruent with previous findings that habitat loss and fragmentation reduce the number of pollen donors and progeny fitness [20][21][22]83]. 
Planted and wild populations in this study are all subject to the negative effects of fragmentation. Although circa situm and remnant wild populations may conserve adult genetic diversity, the highly correlated paternity of their progenies may compromise the regeneration ability of seeds produced by wild and planted trees alike [83]. These results show that although circa situm may be a viable conservation strategy for adult trees, it may limit the regeneration ability of populations in disturbed habitats.

Conclusions
In summary, we evaluated the differences in genetic diversity and structure and paternity correlation between planted and wild stands of the fruit tree S. purpurea. Our results indicated that planted trees harbor similar levels of genetic diversity as wild stands. The genetic structure is explained by differences among sites, possibly due to genetic admixture between planted and wild stands within sites, and fruit production is the result of sexual reproduction from pollen flow from a limited number of males and reproductive assurance mechanisms such as apomixis. Therefore, although circa situm conditions have maintained moderate levels of genetic diversity in this species, conservation of natural populations may be necessary to increase gene flow and the diversity of pollen donors, which may result in higher fitness of future generations [21,83]. This in turn, increases the likelihood that populations will persist in the long term. If large undisturbed habitats are not conserved, the genotypes preserved in circa situm may only represent a subset of the original genetic diversity of the species and may falsely be considered a genetic reservoir, without considering its relationship with the overall fitness of the species. Therefore, to maximize the effectiveness of circa situm strategies in dioecious plants the vegetative propagation of trees should originate from spatially distant individuals to promote high genetic diversity and should strive to maintain equal sex ratios to guarantee sexual reproduction and more diverse progenies in the long term.